Stochastic gradient descent


Stochastic gradient descent (SGD) is an efficient offline optimization method for functions $f$ with finite-sum structure,
$$f(\mathbf{x}) = \sum_{i=1}^n f_i(\mathbf{x}).$$
The goal is to find $\hat{\mathbf{x}}$ such that $f(\hat{\mathbf{x}}) \le \min_{\mathbf{x}} f(\mathbf{x}) + \epsilon$ for some tolerance $\epsilon > 0$. Instead of computing the full gradient $\nabla f = \sum_{i=1}^n \nabla f_i$ at each step, SGD samples a single index $i$ and steps along $-\nabla f_i$, which is an unbiased estimate of $\tfrac{1}{n}\nabla f$ when $i$ is drawn uniformly.
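As a minimal sketch (assuming a hypothetical noiseless least-squares instance, $f_i(\mathbf{x}) = (\mathbf{a}_i^\top \mathbf{x} - b_i)^2$, which is not specified in the note itself), the single-sample update looks like:

```python
import numpy as np

# Hypothetical finite-sum objective: f(x) = sum_i (a_i . x - b_i)^2,
# with a consistent (noiseless) linear system so the minimum is known.
rng = np.random.default_rng(0)
n, d = 200, 3
A = rng.normal(size=(n, d))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true

def sgd(A, b, steps=20_000, lr=0.01):
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        i = rng.integers(len(b))               # sample one term f_i uniformly
        grad_i = 2 * (A[i] @ x - b[i]) * A[i]  # gradient of f_i alone
        x -= lr * grad_i                       # step along -grad f_i
    return x

x_hat = sgd(A, b)
```

Each iteration touches one row of `A`, so the per-step cost is $O(d)$ rather than the $O(nd)$ of a full-gradient step.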


Gradient descent

#incomplete
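For contrast with the single-sample updates above, a full-batch gradient descent sketch on the same kind of assumed least-squares objective (again hypothetical, not from the note) could be:

```python
import numpy as np

# Full-batch gradient descent: each step uses the exact gradient
# sum_i grad f_i for f(x) = sum_i (a_i . x - b_i)^2.
rng = np.random.default_rng(1)
n, d = 100, 2
A = rng.normal(size=(n, d))
x_true = np.array([0.5, -1.0])
b = A @ x_true

# Step size chosen conservatively from the smoothness constant
# L = 2 * ||A||_2^2 of this quadratic objective.
lr = 0.5 / np.linalg.norm(A, 2) ** 2

x = np.zeros(d)
for _ in range(500):
    grad = 2 * A.T @ (A @ x - b)  # full gradient over all n terms
    x -= lr * grad
```

Each step here costs $O(nd)$, which is exactly the per-iteration work SGD avoids by sampling one term.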


References:
1.
2. https://web.stanford.edu/class/ee270/scribes/lecture16.pdf